Clustering with EM: Complex Models vs. Robust Estimation
Abstract
Clustering multivariate data contaminated by noise is difficult, particularly in the framework of mixture model estimation, because noisy data can significantly affect the parameter estimates. This paper addresses the problem with respect to likelihood maximization using the Expectation-Maximization (EM) algorithm. Two approaches are compared. The first defines mixture models that take noise into account. The second is based on robust estimation of the model parameters in the maximization step of EM. Both have been tested separately, then jointly. Finally, a hybrid model is proposed. Results on artificial data are given and discussed.
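To make the first approach concrete, here is a minimal sketch of EM for a Gaussian mixture extended with one extra component that absorbs noise. It is not the model proposed in the paper: the uniform noise density over the bounding box of the data, the function names `gaussian_pdf` and `em_with_noise`, and the regularization term `reg` are all assumptions chosen for illustration.

```python
import numpy as np

def gaussian_pdf(X, mean, cov):
    """Multivariate normal density evaluated at each row of X."""
    d = X.shape[1]
    diff = X - mean
    solved = np.linalg.solve(cov, diff.T).T        # cov^{-1} (x - mean)
    maha = np.einsum("ij,ij->i", diff, solved)     # squared Mahalanobis distance
    _, logdet = np.linalg.slogdet(cov)
    return np.exp(-0.5 * (maha + d * np.log(2 * np.pi) + logdet))

def em_with_noise(X, k, n_iter=100, seed=0, reg=1e-6):
    """EM for k Gaussian components plus one uniform noise component.

    Assumption: noise is uniform over the bounding box of the data,
    so its density is the constant 1 / volume.
    """
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    n, d = X.shape
    volume = np.prod(X.max(axis=0) - X.min(axis=0))
    noise_density = 1.0 / volume

    # Initialization: random data points as means, global covariance.
    means = X[rng.choice(n, size=k, replace=False)]
    covs = np.array([np.cov(X.T) + reg * np.eye(d) for _ in range(k)])
    weights = np.full(k + 1, 1.0 / (k + 1))        # last entry is the noise weight

    for _ in range(n_iter):
        # E-step: responsibilities of each component (and noise) for each point.
        dens = np.empty((n, k + 1))
        for j in range(k):
            dens[:, j] = weights[j] * gaussian_pdf(X, means[j], covs[j])
        dens[:, k] = weights[k] * noise_density
        resp = dens / dens.sum(axis=1, keepdims=True)

        # M-step: weighted updates; the noise component only has a weight.
        nk = resp.sum(axis=0)
        weights = nk / n
        for j in range(k):
            means[j] = resp[:, j] @ X / nk[j]
            diff = X - means[j]
            covs[j] = (resp[:, j, None] * diff).T @ diff / nk[j] + reg * np.eye(d)

    return weights, means, covs, resp
```

In this sketch, points that fit no Gaussian well receive a high responsibility in the last column of `resp`, which reduces their influence on the mean and covariance updates. The second approach in the abstract would instead change the M-step itself, replacing the weighted mean and covariance with robust estimators.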
Similar resources
A robust wavelet based profile monitoring and change point detection using S-estimator and clustering
Some quality characteristics are well defined when treated as response variables and are related to some independent variables. This relationship is called a profile. Parametric models, such as linear models, may be used to model profiles. However, in practical applications due to the complexity of many processes it is not usually possible to model a process using parametric models. In these cas...
A robust EM clustering algorithm for Gaussian mixture models
Clustering is a useful tool for finding structure in a data set. The mixture likelihood approach to clustering is a popular clustering method, in which the EM algorithm is the most used method. However, the EM algorithm for Gaussian mixture models is quite sensitive to initial values and the number of its components needs to be given a priori. To resolve these drawbacks of the EM, we develop a ...
Robust Method for E-Maximization and Hierarchical Clustering of Image Classification
We developed a new semi-supervised EM-like algorithm that is given the set of objects present in each training image, but does not know which regions correspond to which objects. We have tested the algorithm on a dataset of 860 hand-labeled color images using only color and texture features, and the results show that our EM variant is able to break the symmetry in the initial solution. We compared...
Unsupervised learning of regression mixture models with unknown number of components
Regression mixture models are widely studied in statistics, machine learning and data analysis. Fitting regression mixtures is challenging and is usually performed by maximum likelihood by using the expectation-maximization (EM) algorithm. However, it is well-known that the initialization is crucial for EM. If the initialization is inappropriately performed, the EM algorithm may lead to unsatis...
ADAPTIVE NEURO FUZZY INFERENCE SYSTEM BASED ON FUZZY C–MEANS CLUSTERING ALGORITHM, A TECHNIQUE FOR ESTIMATION OF TBM PENETRATION RATE
The tunnel boring machine (TBM) penetration rate estimation is one of the crucial and complex tasks encountered frequently to excavate the mechanical tunnels. Estimating the machine penetration rate may reduce the risks related to high capital costs typical for excavation operation. Thus establishing a relationship between rock properties and TBM pe...